ARROW-11559: [C++] Use smarter Flatbuffers verification parameters #9447
Note that we could adopt a similar strategy when validating the Parquet Thrift structures (related: https://issues.apache.org/jira/browse/PARQUET-1877).
cc @emkornfield
Flatbuffers is able to encode a virtually unbounded number of schema fields in a small buffer. Verifying that many fields with the Flatbuffers verifier seems to result in potentially unbounded verification times, which is a denial-of-service risk. To mitigate the risk, require that a Flatbuffers buffer cannot represent more than one Flatbuffers table per buffer bit, which should always be true for well-formed Arrow IPC metadata. Indeed, the only recursive table, the `Field` table in Schema.fbs, mandates the presence of its `type` member (though it's not marked as required in the Flatbuffers definition, it's validated by the IPC read routines).
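For illustration, the "one table per buffer bit" rule can be sketched as a small helper that turns a buffer size into a table budget. This is only a sketch under stated assumptions: `MaxTablesForBuffer` is a hypothetical name, not a function from this PR, and the 32-bit clamp assumes the verifier takes its table limit as a 32-bit unsigned value (as the `max_tables` parameter of the FlatBuffers C++ `Verifier` does, to my knowledge).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper (not taken from the Arrow sources): upper bound on the
// number of tables a verifier should accept for a buffer of `size_in_bytes`
// bytes, allowing at most one table per buffer bit.
inline std::uint32_t MaxTablesForBuffer(std::size_t size_in_bytes) {
  constexpr std::uint32_t kMax = std::numeric_limits<std::uint32_t>::max();
  // Clamp before multiplying so that 8 * size cannot wrap around.
  if (size_in_bytes > kMax / 8) {
    return kMax;
  }
  return static_cast<std::uint32_t>(size_in_bytes * 8);
}
```

The result would then be passed as the verifier's table limit in place of a fixed constant, so that small buffers get a small budget and a malicious buffer cannot make verification cost grow much faster than its own size.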
To quote an answer that I got on the Flatbuffers mailing list: a buffer of a given size can basically encode a nearly unlimited number of tables, since Flatbuffers can deduplicate tables (e.g. you can have a Field with N times the same child, itself with M times the same child, etc., which can produce a combinatorial explosion).
Hmm, I'll add the regression file as a separate PR then.
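To make the explosion concrete, here is a rough model (my own simplification, not code from the PR or from Flatbuffers): a chain of nested levels where each level's table references the single table of the next level `fanout` times. The function names are hypothetical. Each extra reference costs only a few bytes in the buffer, while a verifier that follows every reference visits a geometrically growing number of tables.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Illustrative model only: `depth` nested levels, each level's table listing
// the one table of the next level `fanout` times. A verifier following every
// reference visits 1 + fanout + fanout^2 + ... + fanout^depth tables.
// The result saturates at the maximum uint64_t value instead of overflowing.
inline std::uint64_t TablesVisited(std::uint64_t fanout, unsigned depth) {
  assert(fanout >= 1);
  constexpr std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();
  std::uint64_t level = 1;  // tables reached at the current level
  std::uint64_t total = 1;  // the root table
  for (unsigned i = 0; i < depth; ++i) {
    if (level > kMax / fanout) return kMax;  // level * fanout would overflow
    level *= fanout;
    if (total > kMax - level) return kMax;   // total + level would overflow
    total += level;
  }
  return total;
}

// Distinct tables actually stored in the buffer under the same model.
inline std::uint64_t TablesStored(unsigned depth) {
  return static_cast<std::uint64_t>(depth) + 1;
}
```

With a fan-out of 10 and 3 nested levels, only 4 distinct tables are stored but 1111 are visited; bounding the number of visited tables by the buffer size in bits cuts this off early.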
TODO:
* [ ] Add OSS-Fuzz regression file

Closes apache#9447 from pitrou/ARROW-11559-fbb-verification-params
Authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Micah Kornfield <[email protected]>